Studies using massive, passively collected data from communication technologies have revealed
many ubiquitous aspects of social networks, helping us understand and model social media,
information diffusion, and organizational dynamics. More recently, these data have come tagged
with geographic information, enabling studies of human mobility patterns and the science of
cities. We combine these two pursuits and uncover reproducible mobility patterns amongst social
contacts. First, we introduce measures of mobility similarity and predictability and measure
them for populations of users in three large urban areas. We find individuals’ visitation
patterns are far more similar to and predictable by social contacts than strangers and that
these measures are positively correlated with tie strength. Unsupervised clustering of hourly
variations in mobility similarity identifies three categories of social ties and suggests
geography is an important feature to contextualize social relationships. We find that the
composition of a user’s ego network in terms of the type of contacts they keep is correlated
with mobility behavior. Finally, we extend a popular mobility model to include movement choices
based on social contacts and compare its ability to reproduce empirical measurements with two
additional models of mobility.
The rise of ubiquitous mobile computing has facilitated
the generation, collection, and storage of massive data
sets of human behavior. Social interactions are captured
in calls, emails, and tweets, while movement is logged
by check-ins and GPS traces. Studied separately,
social and mobility data have produced a wealth of
insights. Our understanding of how information and diseases
spread, how our friends affect our well-being,
and how societies are structured has been greatly
improved by studying large social networks. Mobility
data have revealed that human movement is regular,
predictable, and unique. To complement
empirical findings, a number of simple models have been
proposed to reproduce the basic dynamics of both social
networks [19-21] and mobility [3, 17, 22, 23], but the two
have been traditionally treated as independent.

Recognizing the interaction between social behavior
and mobility, researchers began measuring correlations
between the two. They found that social networks are
heavily influenced by geography. We are far more likely
to be friends with someone nearby than far away, a
fact that is useful for predicting missing links [25, 26].
With an estimated 15% to 30% of all trips taken for
social purposes, it is not surprising that the movement of
our friends can improve predictions of where we will be
next [22]. While insightful, the primary interest
of most previous studies was measuring and reproducing
patterns of geographic distance and its impact on
network topologies [22]. In dense urban areas, however,
distance is less restrictive. Residents have access to a variety
of transportation options and are free to choose locations
that provide the best goods and services rather than the
closest. The self-organized districts and neighborhoods of
cities make it more natural to describe mobility as
movement between sets of locations, or habitats [29]. Which
habitats users share with their contacts and when they
share them may indicate the nature of the social
relationship, e.g. a coworker or a friend. Two individuals
co-located between 9am and 5pm on weekdays likely have a
different relationship than two who are found in the same
area at midnight on a Saturday. In these scenarios,
mobility is defined and measured as discrete visits to places
within a city that are shared with different types of social
contacts at different times. Previous work has shown
that users who visit similar places are more likely to be
friends in online location-based social networks.

Here we describe a set of metrics to explicitly measure
patterns of mobility and social behavior that occur within
the context of cities. Using call detail records (CDRs)
produced by millions of mobile phone users, we find that
individuals have far more similar visitation patterns to
social contacts than to strangers and that the movement
of these contacts can be used to reconstruct a
considerable portion of the individuals’ movements. We also
find strong correlations between tie strength and
mobility similarity and show that mobility similarity can be
used to classify social relationships and recover
semantic information about the nature of a link in the social
network. Finally, we propose an extension to the
mobility model of Song et al. that incorporates movement
based on the visitation patterns of social contacts and
can reproduce empirical relationships found in the data.
We call this model the GeoSim model and compare it
against empirical data and two other mobility models.
The generality of these results is demonstrated by their
reproducibility in three different cities in two different
countries. This study advances the
understanding of how social behavior affects our spatial choices
in the context of information and communication
technologies (ICTs).

FIG. 1. A small sample of calls between residents is shown
for each of three cities. CDRs provide the location of each
caller as well as a record of communication between them. A
dot is drawn at the approximate location of a user and a link
appears between two users calling each other. Our aim is to
identify useful and reproducible patterns from this coupled
tangle of social and spatial behavior.


MATERIALS AND METHODS 

Data 

Call detail records (CDRs) are generated when a
mobile phone user performs an action that requires the
provider’s network, for example placing a call or sending
a text message. These records generally contain the ID
of the tower the phone connected through, which gives a
rough estimate of the user’s location. When the
individual receiving a call or message is a customer of the same
provider, the unique identifier of the receiver and their
location may also be stored. CDRs allow us to observe
mobility patterns of individuals and construct social
networks containing millions of people. Figure 1 shows a
small sample of calls between city residents during a
single hour and illustrates dynamics of the urban system we
wish to understand.

Our data consist of anonymized CDRs collected from
three cities (R1, R2, and R3) in two different
industrialized countries. Data for two cities (R1 and R2) were
obtained from the same provider in country 1, while data
from another provider were used for the third city (R3). The
observation period covers 15 months in R1 and R2 and 5 months
in R3 and contains over 1 billion events in total. Each
record provides the time of the communication event, an
anonymous unique ID for the caller and callee, and the
ID of the tower used by at least the caller (in the case of
R3) and in some cases the callee (R1 and R2). More
information on the datasets can be found in the electronic
supplementary materials (ESM).

Social and Mobility Measurements 

In each city, we construct a social network containing
all users (nodes) with sufficient call volume and connect
users (edges) if they have regular contact with each
other (see ESM for more detail). Each node is assigned
a 48 × L location matrix $\mathbf{L}$, where L is the number of
unique cell towers in the city. Each row of this matrix
corresponds to an hour of a typical weekday or an hour
of a typical weekend day (giving 48 hours in total) and
each element $L_{t,j}$ contains the number of times that a
user made a call from location j during hour t across
the entire observation period (Figure 2A). We refer to
individual rows of this matrix, $\mathbf{v}(t)$, as location vectors.
The location matrix and location vectors can be used
to compute various mobility properties of nodes (mobile
phone users). Summing all elements of the location
matrix gives the number of calls made and received by a user,
$N = \sum_{t} \sum_{j} L_{t,j}$, while summing each column and dividing
by N provides the frequency of visits a user made to
every location in the city, $f_j = \frac{1}{N} \sum_{t} L_{t,j}$. Summing visits
to each location at all times gives a single location vector
$\mathbf{v}$ for each user and represents the total visits made to
each location over the period of data collection.
Applying the sign function and summing across all elements of
this vector provides the number of unique locations
visited, $S = \sum_{j} \operatorname{sign}(v_j)$. All of these features are measures
of a user’s mobility behavior within the city.
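
The quantities defined above follow directly from a location matrix. The Python sketch below is illustrative only: the function name `mobility_stats` and the toy matrix (two hourly rows rather than 48) are assumptions made for the example and do not come from the data.

```python
def mobility_stats(loc_matrix):
    """Summarize a location matrix given as a list of hourly rows,
    each row holding call counts per tower (48 rows in the full setup)."""
    # N: total number of calls recorded for the user
    n_calls = sum(sum(row) for row in loc_matrix)
    # v: total visits to each location, summed over all hours
    totals = [sum(col) for col in zip(*loc_matrix)]
    # f_j: share of the user's calls made from location j
    freqs = [c / n_calls if n_calls else 0.0 for c in totals]
    # S: number of unique locations visited (the sum of sign(v_j))
    n_unique = sum(1 for c in totals if c > 0)
    return n_calls, totals, freqs, n_unique


# Hypothetical user observed at three towers during two hours.
toy = [[2, 0, 1],
       [1, 0, 0]]
n_calls, totals, freqs, n_unique = mobility_stats(toy)
```

Storing the matrix as nested lists keeps the example dependency-free; an array library would be the natural choice at the scale of millions of users.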

We can also compare the location matrices and
vectors of two mobile phone users and measure
similarities between the two. While a number of metrics could
be used to measure mobility similarity between nodes
(Figure 2B), here we focus on the cosine similarity
between the location vectors of two nodes i and j,
defined as $\cos\theta_{i,j} = \frac{\mathbf{v}_i \cdot \mathbf{v}_j}{|\mathbf{v}_i|\,|\mathbf{v}_j|}$. The cosine similarity
measures the cosine of the angle between two vectors in our
L-dimensional location space (Figure 2C). It has been
shown to correlate strongly with the probability of being
friends in an online social network and has a number
of desirable properties. It is sensitive to visit
frequencies rather than set intersections alone, so two users who
share frequently visited locations appear more similar
than those who share less important destinations. Unlike
the Pearson correlation coefficient, it does not overstate
similarity when vectors contain many zero elements (as is
often the case). Finally, the cosine similarity is a
measure of the angle only and is not affected by differences
in the total number of calls made by two users. For the
remainder of this paper, we refer to the cosine similarity
between two location vectors as mobility similarity.
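
As a minimal sketch of this measure, the Python function below computes the cosine similarity between two location vectors. The function name and the convention of returning 0 for an all-zero vector are assumptions made for illustration.

```python
import math


def cosine_similarity(v_i, v_j):
    """Cosine of the angle between two location vectors of equal length."""
    dot = sum(a * b for a, b in zip(v_i, v_j))
    norm_i = math.sqrt(sum(a * a for a in v_i))
    norm_j = math.sqrt(sum(b * b for b in v_j))
    # Assumption: a user with no recorded visits is similar to nobody.
    if norm_i == 0 or norm_j == 0:
        return 0.0
    return dot / (norm_i * norm_j)


# Same visit proportions, but one user calls twice as often.
sim_scaled = cosine_similarity([3, 0, 1], [6, 0, 2])
# No shared locations at all.
sim_disjoint = cosine_similarity([1, 0], [0, 1])
```

The first pair illustrates the insensitivity to call volume noted above: scaling one vector leaves the angle, and hence the similarity, unchanged.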
FIG. 2. Similarity of visitation patterns between nodes in social networks. For each user, we keep track of (A) how many
visits are made to locations across the city and (B) construct a social network by tracking calls to others. We can then define
(C) the geographic cosine similarity between two users by computing the cosine of the angle between any two vectors in the
location space.


The mobility similarity between two users can be
computed from their entire movement history or visits during
a small portion of a weekday or weekend. In the former
case, we assign a single mobility similarity value to an
edge in the network, while in the latter, we assign a time
series of cosine similarity, $\cos\theta(t) = \frac{\mathbf{v}_i(t) \cdot \mathbf{v}_j(t)}{|\mathbf{v}_i(t)|\,|\mathbf{v}_j(t)|}$. This
time series reveals how often two users visit the same
places at a given time of the day and will later function
as an attribute to differentiate between types of social
contacts.
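
The hourly time series can be sketched by applying the cosine similarity row by row to two location matrices. The helper names below are hypothetical, and the toy matrices use three hourly rows instead of 48 to keep the example short.

```python
import math


def _cosine(a, b):
    """Cosine similarity, defined as 0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def similarity_time_series(loc_matrix_i, loc_matrix_j):
    """One similarity value per hourly row of the two location matrices."""
    return [_cosine(row_i, row_j)
            for row_i, row_j in zip(loc_matrix_i, loc_matrix_j)]


# Two hypothetical users, two towers, three hours.
user_i = [[1, 0], [0, 2], [0, 0]]
user_j = [[2, 0], [3, 0], [1, 1]]
series = similarity_time_series(user_i, user_j)
```

In this toy case the users share a tower in the first hour, use different towers in the second, and the first user has no activity in the third.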

Within this mathematical framework, we can calculate
an upper bound on how much of an individual’s location
vector can be reconstructed from a linear combination of
the location vectors of other users. For example, a
co-worker may share office space with an individual, but not
live in the same neighborhood, while the opposite may be
true for a member of that individual’s family. By
combining the visitation patterns of the co-worker and family
members, however, a more complete picture of an individual’s
visitation patterns can be obtained. Mathematically, we
define a set of users F for each individual i in the network.
For example, we may choose F to be neighbors in i’s ego
network or a random set of nodes. The location vectors
$\mathbf{v}_j$ where $j \in F$ are used as columns of an L × |F| matrix
we denote as A and span a subspace of the L-dimensional
location space. We then use QR-decomposition to find
an orthonormal basis $B = \{\mathbf{q}_1, \ldots, \mathbf{q}_{|F|}\}$ for A. Our target
user’s location vector is then projected into this vector
subspace: $\hat{\mathbf{v}} = \sum_{k=1}^{|F|} (\mathbf{v} \cdot \mathbf{q}_k)\,\mathbf{q}_k$. This projection represents
the best approximation of a user’s visits based on the
visits of users in F. We can quantify how it compares
to a user’s true visitation patterns by taking the ratio of
its magnitude to the magnitude of the actual location
vector $|\mathbf{v}|$. We refer to this ratio as predictability and
define it mathematically as $|\hat{\mathbf{v}}| / |\mathbf{v}|$. When predictability is
1, the visitation frequencies of a user can be completely
obtained from location vectors of users in F and when
it is 0, nothing about their visits can be learned. We
note that for values between 0 and 1, predictability
cannot be interpreted as the fraction of a user’s visits that
can be recovered, as the vector norms are computed using
the standard L2 norm. In principle, however, these two
quantities should be strongly correlated because the
individual elements of location vectors can never be negative.
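
A minimal sketch of the predictability calculation follows. To stay dependency-free it builds the orthonormal basis with modified Gram-Schmidt orthogonalization instead of a library QR routine; both span the same subspace, so the projection is unchanged. The function names and the tolerance `tol` are assumptions made for the example.

```python
import math


def orthonormal_basis(vectors, tol=1e-12):
    """Orthonormal basis for the span of the given vectors
    (modified Gram-Schmidt; linearly dependent vectors are skipped)."""
    basis = []
    for v in vectors:
        w = [float(x) for x in v]
        for q in basis:
            proj = sum(a * b for a, b in zip(w, q))
            w = [a - proj * b for a, b in zip(w, q)]
        norm = math.sqrt(sum(a * a for a in w))
        if norm > tol:
            basis.append([a / norm for a in w])
    return basis


def predictability(target, others):
    """Ratio |v_hat| / |v|, where v_hat is the projection of the target
    location vector onto the span of the other users' location vectors."""
    basis = orthonormal_basis(others)
    # For an orthonormal basis, |v_hat|^2 is the sum of squared coefficients.
    coeffs = [sum(a * b for a, b in zip(target, q)) for q in basis]
    proj_norm = math.sqrt(sum(c * c for c in coeffs))
    target_norm = math.sqrt(sum(a * a for a in target))
    return proj_norm / target_norm if target_norm else 0.0


# Hypothetical case: a co-worker covers location 0, a relative covers location 1.
full = predictability([1, 1, 0], [[1, 0, 0], [0, 1, 0]])
partial = predictability([1, 1, 0], [[1, 0, 0]])
none = predictability([1, 1, 0], [[0, 0, 1]])
```

The `partial` case evaluates to about 0.71 even though the single contact accounts for half of the target’s visits, which illustrates why intermediate values are not fractions of visits recovered.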

We next apply these methods and metrics to social 
network and mobility data from three cities. 

RESULTS 

Correlations between social behavior and mobility 

Though similarity can be measured between any two
arbitrary nodes and predictability from an arbitrary set
of nodes F, we hypothesize that an individual will likely
be more similar to and predictable by social contacts.
To test this, we compare the mobility similarity between
users that call each other regularly with the similarity
between random users, and the predictability achieved
using a node’s social ties with the predictability using
random sets of nodes (essentially rewiring the social
network, but leaving mobility intact). Figures 3A and 3B
show the distribution of similarity and predictability
values for the networks in each city. We find significantly
more similarity and predictability in empirical networks
when compared to random re-wirings. The similarity
distribution is bimodal, with peaks at very low similarity
near 0 and very high similarity near 1. We measure very
high values of predictability when using an individual’s
social contacts as opposed to a random set of people in
the same city. As other studies have suggested, we find
that visitation patterns are strongly linked to our social
relationships; our movements are far more similar to those of
our social contacts than to those of random users.

FIG. 3. Correlations between mobility and social behavior. For each city, we compute the (A) distribution of cosine similarity
and (B) predictability using observed edges (colored lines) and compare to distributions made using randomized edges. We find
both mobility similarity and predictability are much higher when using actual social contacts compared to random users. Social
similarity is also correlated with mobility similarity. (C) Ranking each user’s contacts by number of calls, we find that stronger
ties are more geographically similar. (D) Moreover, the more common contacts shared by two users, the more geographically
similar those individuals tend to be. Finally, we explore how social behavior is correlated with mobility. (E) We find that users
with more unique contacts tend to visit more unique locations. (F) Users who distribute their calls to contacts more evenly
(higher entropy) are more predictable than users with more uneven call distributions. This suggests that users who share social
attention more evenly also share locations. Figures S2 and S3 in the ESM show these results controlling for call frequency.

Interestingly, we observe higher levels of mobility
similarity between users separated by short network
distances. We find that two connected nodes are on
average 10 times more geographically similar than two
randomly selected nodes. Nodes separated by two hops, or
“friends of friends”, are nearly twice as similar as
randomly selected nodes and this elevated similarity is
observed up to three hops from an individual (see ESM
Figure S5 for details). This result is expected as two
users who do not contact each other may both visit the
same friend.


Next, we explore the relationship between tie strength
and mobility similarity. We rank all contacts in each
user’s ego network by the number of calls shared
between them (1 being the contact that shares the most calls)
and compute the average mobility similarity for all edges
with a given rank (Figure 3C). Stronger contacts have
higher mobility similarity on average than weaker ties,
though this effect subsides for contacts below rank 10.
We note that region R3 shows a slightly different trend.
This is likely due to the shorter observation period in
this region resulting in few individuals with more than
10 regular contacts, biasing the tail of this distribution
(see ESM for more details). We also observe a positive
correlation between social similarity, as measured by the
Jaccard index between the neighbors of two nodes, and
mobility similarity (Figure 3D); individuals who share
more social contacts share more locations.
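
The two social measures used here, tie rank and the Jaccard index of neighbor sets, can be sketched as follows. The function names are hypothetical, and breaking ties in call counts by contact identifier is an assumption; no tie-breaking rule is specified above.

```python
def rank_contacts(call_counts):
    """Map each contact to its rank, where rank 1 shares the most calls.
    Assumption: equal call counts are ordered by contact identifier."""
    ordered = sorted(call_counts, key=lambda c: (-call_counts[c], c))
    return {contact: rank for rank, contact in enumerate(ordered, start=1)}


def jaccard_index(neighbors_a, neighbors_b):
    """Size of the shared neighbor set divided by the size of the union."""
    a, b = set(neighbors_a), set(neighbors_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical ego network with three contacts.
ranks = rank_contacts({"x": 5, "y": 9, "z": 1})
overlap = jaccard_index({"a", "b", "c"}, {"b", "c", "d"})
```

Averaging mobility similarity over all edges with a given rank, or binning edges by their Jaccard index, then gives curves of the kind described above.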

We also find other aspects of social behavior to be
correlated with mobility. Individuals with more friends tend
to visit more locations, but despite this exploratory
behavior, are still more predictable due to the increased
information that additional contacts provide for reconstructing
these movements (Figure 3E). Again R3 appears as
an outlier due to the shorter observation period and the
absence of mobility information on the user receiving a
call. We then measure the entropy of the distribution
of frequencies with which a user i calls each contact j and
find that individuals with more entropic calling patterns
(who distribute their calls more evenly) also visit more unique
places and are more predictable (Figure 3F). The
visitation patterns of those who spread social attention more
evenly can be more easily reproduced. Finally, to
ensure that these results are not an artifact of sampling
frequencies, we compute these distributions and
correlations controlling for a user’s number of CDR events and
degree, finding no change in the
relationships (see Figures S1, S2, and S3 in the ESM).
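
The entropy of a user’s calling pattern can be sketched as below. The use of Shannon entropy with the natural logarithm, and the function name, are assumptions made for illustration; the base and any normalization are not specified above.

```python
import math


def call_entropy(call_counts):
    """Shannon entropy (in nats) of how a user's calls are spread over contacts.
    call_counts holds the number of calls to each contact."""
    total = sum(call_counts)
    if total == 0:
        return 0.0
    probs = [c / total for c in call_counts if c > 0]
    return -sum(p * math.log(p) for p in probs)


# Two hypothetical users with ten calls each and two contacts.
even = call_entropy([5, 5])      # attention split evenly
uneven = call_entropy([9, 1])    # attention concentrated on one contact
```

A user who splits calls evenly has the maximum entropy for their number of contacts, while a user who calls a single contact has zero entropy.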

Contextualizing social contacts with mobility 

Having demonstrated that social behavior and
location choices are strongly correlated, we next use temporal
variations in mobility similarity to provide context into
the type of social relationship between two individuals
in our networks. We measure mobility similarity $\cos\theta(t)$
over the course of a typical weekday and weekend
under the hypothesis that different types of social contacts
will have different levels of similarity at different times.
To identify any groups, we use a simple k-means
unsupervised clustering algorithm on these similarity time
series. We find three persistent groups. While we have
no ground truth data about the nature of these
relationships, for clarity, we label each group according to its
qualitative signature: (i) acquaintances with uniformly
low levels of similarity, (ii) co-workers with high
similarity during work hours on weekdays and low similarity on
nights and weekends, and (iii) family/friends with high
similarity on nights and weekends. Figure 4A shows the
cluster centers for each group. While other interesting
clusters are found for k > 3, they appear as subgroups
of the three general archetypes we discuss here. More
information on the clustering method along with results
for different numbers of clusters and different clustering
methods can be found in the ESM. These three groups
appear in each city despite the unsupervised nature of
the algorithm; cluster centers start at random locations,
yet find remarkably similar final positions in each city.
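
The clustering step can be sketched with a small k-means routine. This is an illustration under stated assumptions, not the procedure used for the results: it replaces the random starting centers with a deterministic farthest-first initialization so the toy run is reproducible, and it uses four-point series instead of the 48-hour series.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two series."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_first_centers(series, k):
    """Deterministic start: first series, then repeatedly the series
    farthest from all centers chosen so far (assumed initialization)."""
    centers = [list(series[0])]
    while len(centers) < k:
        nxt = max(series, key=lambda s: min(sq_dist(s, c) for c in centers))
        centers.append(list(nxt))
    return centers


def kmeans(series, k, max_iter=100):
    """Lloyd's algorithm: alternate assignment and center updates."""
    centers = farthest_first_centers(series, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda c: sq_dist(s, centers[c]))
                      for s in series]
        if new_labels == labels:
            break  # assignments stable, so the centers are too
        labels = new_labels
        for c in range(k):
            members = [s for s, lab in zip(series, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return centers, labels


# Hypothetical similarity series: [work hours, work hours, night, night].
toy_series = [
    [0.0, 0.0, 0.0, 0.0], [0.1, 0.1, 0.1, 0.1],   # uniformly low
    [1.0, 0.9, 0.1, 0.1], [0.8, 0.9, 0.2, 0.1],   # high during work hours
    [0.1, 0.1, 0.9, 0.9], [0.2, 0.1, 0.8, 0.9],   # high at night
]
centers, labels = kmeans(toy_series, 3)
```

On this toy input the three groups of series end up in three separate clusters, mirroring the acquaintance, co-worker, and family/friend signatures described above.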

Assigning each edge to a cluster based on the time
series of mobility similarity effectively paints all edges in
the network in a specific color, as illustrated in
Figure 4. Previous work has found that edges in real social
networks are much more likely to be arranged in
triangles than expected by chance, resulting in high clustering
coefficients. In this case,
we expect that some social groups, such as co-workers or
close friends, should exhibit high degrees of intra-group
clustering, while others such as acquaintances do not. For
example, many of an individual’s co-workers visit
similar places during work hours and tend to call each other
because they are part of the same office community. We
find evidence of this when measuring the clustering
coefficient within subgraphs containing only edges belonging
to a single mobility similarity cluster (Figure 4B).
Interestingly, the clustering coefficient ($C_g$) of acquaintances
is much lower than that of the co-worker and family ties,
even though acquaintances make up nearly 70% of links in the network.
This provides additional evidence that we are capturing
very different types of relationships with our
classifications based on mobility similarity. Moreover, these
results highlight mobility similarity as a property to label
functional communities within social networks as well as
individual edges.
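
The subgraph measurement can be sketched as follows. The sketch assumes that $C_g$ denotes the global clustering coefficient (closed connected triples divided by all connected triples); the function names and the edge labels are hypothetical.

```python
from itertools import combinations


def global_clustering(edges):
    """Fraction of connected triples that are closed into triangles."""
    adj = {}
    for a, b in edges:
        if a == b:
            continue  # ignore self-loops
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    closed = 0
    triples = 0
    for nbrs in adj.values():
        # Every pair of neighbors forms a connected triple centered on the node.
        for x, y in combinations(sorted(nbrs), 2):
            triples += 1
            if y in adj[x]:
                closed += 1
    return closed / triples if triples else 0.0


def clustering_within_label(labeled_edges, label):
    """Clustering coefficient of the subgraph keeping only one edge label."""
    return global_clustering([(a, b) for a, b, lab in labeled_edges
                              if lab == label])


# Hypothetical network: a co-worker triangle plus a star of acquaintances.
toy_edges = [
    ("a", "b", "coworker"), ("b", "c", "coworker"), ("a", "c", "coworker"),
    ("a", "d", "acquaintance"), ("a", "e", "acquaintance"),
    ("a", "f", "acquaintance"),
]
c_coworker = clustering_within_label(toy_edges, "coworker")
c_acquaintance = clustering_within_label(toy_edges, "acquaintance")
```

In the toy network the co-worker subgraph is fully clustered while the acquaintance subgraph contains no triangles, the qualitative contrast reported above.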

Next, we consider how the composition of an
individual’s ego-network correlates with their mobility. Is a
person with a stable job and family likely to be less
exploratory and more predictable than a young college
student with many acquaintances? To answer this, we
bin nodes into groups based on two mobility metrics, the
number of unique locations visited S and how predictable
that user is, $|\hat{\mathbf{v}}| / |\mathbf{v}|$. We then compute the fraction of edges
that belong to each classification for all nodes in each
mobility bin. Figure 4C shows that users who
visit more unique locations tend to have a higher
fraction of acquaintances in their ego network, while Figure
4D suggests that more predictable individuals tend to have
fewer contacts in this category. Conversely, less spatially
explorative individuals and individuals that are easier to
predict tend to have a higher fraction of co-worker and
family/friend labels in their ego network. These results
again show the ability of mobility similarity to add
contextual attributes to a network and reveal novel
relationships between the structure of a user’s ego network and
their mobility behavior. In future work, it may be
interesting to explore correlations between the mix of one’s
ego network and social behaviors such as the propensity
to form new contacts.

Coupling social ties and mobility 

Given the clear empirical relationship between social
contacts and mobility, our remaining task is to identify
a coupled model that captures these dynamics. While
a number of models consider mobility alone [2, 17, 23],
only a few have attempted to link the two [22, 27]. Those
that have combined social and mobility behaviors have
consistently found that nearly 15-30% of trips are made for
social purposes. Though these coupled models have had
considerable success reproducing patterns of geographic
distance within social network structure, they do not, as we show,
always capture properties of geographic similarity.

FIG. 4. Characterizing social ties based on similarity of
movement over time. (A) We perform k-means clustering on
the set of similarity time series from edges in the network. We
find three groups emerge in each city: (i) acquaintances who
have low levels of similarity across all times, (ii) co-workers
who have elevated similarity during work hours on weekdays,
but lower levels on weekends, and (iii) family/friends who
have high similarity on nights and weekends. (B) For each
city we construct subgraphs containing only edges in a single
cluster. We find that these subgraphs retain a high clustering
coefficient ($C_g$) within the co-worker and family/friend groups
while acquaintances are far less likely to have ties among each
other. Finally, we explore how a user’s behavior correlates
with the mobility characteristics of their immediate social
network. (C-D) We group nodes based on their mobility
characteristics (unique locations visited S and predictability $|\hat{\mathbf{v}}| / |\mathbf{v}|$)
then compute the fraction of edges that belong to each of
the identified clusters for each node in the group. Individuals
that are more exploratory (visit more unique places) tend to
have higher fractions of acquaintance ties than individuals
with lower mobility while the reverse trend is observed for
the most predictable individuals.

In light of the time scales we are studying, we make
the assumption that our social network is static and
extend the mobility model introduced by Song et al.
to include movement choices based on social contacts.
We call our extension the GeoSim model [32]. We
compare our model to the original individual-mobility model
(IM model) by Song et al. and the Travel-Friendship
model (TF model) described by Grabowicz et al. See
ESM for more details on implementation and
parameters for model comparisons.

The GeoSim model works as follows: first, a population
of N agents is initialized and connected to replicate
the undirected social network constructed from the CDR
data in R1. Each edge that exists in the call data exists
in the model, but all weights and similarities are set to 0.
Agents are randomly assigned to a location at the start
and their location vectors are initialized to reflect this
single visit. They are allowed to move in a discrete space
of L locations replicating the towers from CDRs.

Each time step corresponds to a single hour of the day.
At each time step, individuals decide whether or not to
change locations according to a previously measured
waiting time distribution, a power-law with an exponential cutoff,
$p(\Delta t) \propto \Delta t^{-1-\beta} \exp(-\Delta t/\tau)$, where $\beta = 0.8$ and $\tau = 17$
hours. If an individual moves, they must decide to either
return to a previously visited location with probability
$1 - \rho S^{-\gamma}$ or explore and visit a new one with
probability $\rho S^{-\gamma}$, where S is the number of unique locations they
have visited thus far and $\rho = 0.6$ and $\gamma = 0.6$ are
parameters chosen by procedures outlined in [17]. In the original
model, an individual u preferentially returns to a
location l with probability proportional to the frequency of
previous visits, $P(l) \propto f_l^{(u)}$, and new locations to explore
are chosen uniformly at random (note that in our version
of the model distance is irrelevant).
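
The explore-or-return decision of the original model can be sketched as below. This is a simplified illustration, not the simulation code: it covers a single move for one agent, omits the waiting-time draw, and uses hypothetical names (`explore_probability`, `im_move`) together with the parameter values quoted above.

```python
import random

RHO, GAMMA = 0.6, 0.6  # parameter values quoted in the text


def explore_probability(n_unique, rho=RHO, gamma=GAMMA):
    """Probability rho * S^(-gamma) of exploring a new location,
    where n_unique (S >= 1) counts the locations visited so far."""
    return rho * n_unique ** (-gamma)


def im_move(visits, n_locations, rng):
    """Perform one move for an agent and return the chosen location.
    visits maps location index -> visit count and is updated in place."""
    unvisited = [l for l in range(n_locations) if l not in visits]
    if unvisited and rng.random() < explore_probability(len(visits)):
        # Exploration: a new location chosen uniformly at random.
        choice = rng.choice(unvisited)
    else:
        # Preferential return: weight by the agent's own visit counts.
        locs = list(visits)
        choice = rng.choices(locs, weights=[visits[l] for l in locs])[0]
    visits[choice] = visits.get(choice, 0) + 1
    return choice


# Hypothetical agent starting with a single visit to location 0.
rng = random.Random(1)
agent_visits = {0: 1}
trajectory = [im_move(agent_visits, 5, rng) for _ in range(300)]
```

Because the exploration probability decays with S, the agent discovers locations quickly at first and then increasingly returns to places it already knows.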

In our extension of this model, we choose some
locations based on social influence. When picking a return
location, our agent has two possibilities. With probability
$1 - \alpha$, they select a return location with the preference for
locations they have visited in the past as in the original
model. With probability $\alpha$ a social contact v is chosen.
The probability a given contact is chosen is directly
proportional to the current mobility similarity between the
two, $P(v) \propto \cos\theta_{u,v}$, and a location to visit is chosen
based on a preference to visit locations frequented by the
selected contact, $P(l) \propto f_l^{(v)}$ (note the location choice is
repeated until an agent finds a location they have visited
before). In the social case, this amounts to preferential
return based on a contact’s visit frequency as opposed
to the ego’s visits. In the event that an agent is
exploring a new location, the same weighted social coin is
flipped. This time, though, with probability $1 - \alpha$ a
random, previously unvisited location is selected and with
probability $\alpha$ the agent again chooses a contact based on
mobility similarity and chooses a new place to visit based
on the visit frequencies of that contact. The cosine
similarity across all edges is recomputed as
the model progresses, so it changes dynamically during the
simulation. A schematic of this process can be found in
Figure 5.
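
The location choice in the extended model can be sketched as a single function. This is an illustration under assumptions rather than the simulation code: the function names are hypothetical, restricting a contact’s visit counts to the admissible locations stands in for repeating the draw until a valid location is found, and the fall-back to the individual rule when no contact offers a valid location (for example while all similarities are still 0) is an assumption.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity between two visit-count vectors (0 if one is empty)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def geosim_choice(ego, contacts, alpha, exploring, rng):
    """Pick the next location index for an agent.
    ego and each entry of contacts are visit-count lists over all locations;
    the ego is assumed to have visited at least one location."""
    visited = [l for l, c in enumerate(ego) if c > 0]
    unvisited = [l for l, c in enumerate(ego) if c == 0]
    # Exploration needs an unvisited location; otherwise treat as a return.
    explore = exploring and bool(unvisited)
    pool = unvisited if explore else visited
    if contacts and rng.random() < alpha:
        # Social choice: pick a contact in proportion to mobility similarity.
        sims = [cosine(ego, c) for c in contacts]
        if sum(sims) > 0:
            contact = rng.choices(contacts, weights=sims)[0]
            # Weight admissible locations by the contact's visit counts.
            weights = [contact[l] for l in pool]
            if sum(weights) > 0:
                return rng.choices(pool, weights=weights)[0]
    # Individual choice (also the assumed fall-back).
    if explore:
        return rng.choice(pool)
    return rng.choices(pool, weights=[ego[l] for l in pool])[0]


# Hypothetical agent with one contact over four locations.
rng = random.Random(0)
ego_visits = [3, 1, 0, 0]
friend_visits = [2, 0, 4, 0]
social_explore = geosim_choice(ego_visits, [friend_visits], 1.0, True, rng)
social_return = geosim_choice(ego_visits, [friend_visits], 1.0, False, rng)
```

With `alpha` set to 1 the agent explores the only unvisited location its contact frequents and returns to the only visited location they share; with `alpha` set to 0 the function reduces to the individual rule of the original model.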

In this variant of the mobility model, the parameter
$\alpha$ controls the influence of social contacts on the
visitation patterns of individuals. When $\alpha = 0$, we recover
the original mobility model, while when $\alpha = 1$ all
location choices are influenced by social ties. In reality,
each user may have an inherent value of $\alpha$ that we cannot
observe. To incorporate this heterogeneity, we simulate
this model for a number of distributions of the parameter
$\alpha$. We find an exponentially distributed $\alpha$ with a mean of
$\langle\alpha\rangle = 0.2$ produces a close fit to distributions of mobility
similarity and predictability observed in the population
and refer the reader to the ESM for results for different
distributions of $\alpha$. This value is consistent with the
results of both Cho et al. and Grabowicz et al. [22],
who find that roughly 15-30% of trips were motivated by
social intentions.


FIG. 5. A schematic description of the GeoSim model. As in
the IM model presented by Song et al., individuals first decide
whether to return to a previously visited location or explore
a new location. The actual choice of location to visit, new
or returning, is made based on either a social influence with
probability $\alpha$ or individual preference with probability $1 - \alpha$.

Having found an appropriate distribution for $\alpha$, we
next compare simulation results with this distribution
to results from the IM model (equivalent to the GeoSim
model with $\alpha = 0$) and the TF model, all run for the
same 1 year duration and population size. Like the IM
model it extends, the GeoSim model is able to reproduce
elements of individual mobility such as the rate of
exploration of new locations S(t) over time (Figure 6A) as
well as the frequency at which users visit their locations $f_k$
(Figure 6B). Here the TF model adequately reproduces
exploration rates, but produces a flatter visit frequency
distribution. In the case of mobility similarity and
predictability, however, only the GeoSim model reproduces
observed behavior (Figure 6C-D). Interestingly, the TF
model results in relatively high predictability of users,
despite similarity values orders of magnitude lower than
those observed in the data or with the IM model. This is
likely due to the flattened frequency distribution, to which
the cosine similarity is highly sensitive. Even if two
users share a few locations due to the friendship component
of the TF model, there are no preferential dynamics that
will continually bring those two users back to that place,
increasing cosine similarity. On the other hand, this flat
frequency distribution makes it highly likely that users
will share at least some locations in common with each
other, making it possible to reproduce location vectors
based on social contacts. Despite its inability to
recover these distributions, the TF model is the only model
tested that builds a social network endogenously. For
this reason, we hope future work will find variants on
this model capable of dynamically reproducing empirical
data of both social and mobility behavior.
FIG. 6. Comparing social mobility models. A) We compare
model results simulating the rate of exploration S(t) with
empirical data. While all three models appear to
overestimate the absolute number of locations visited, the rate of this growth
is consistent between them and in line with the data. B) For each
user, we sort locations based on the number of visits and
compute the frequency that a user visits a location of rank k. We
find that the IM model and our extension to it reproduce
this distribution well, while the TF model is much flatter,
distributing visits more evenly over all locations. C) Only
the GeoSim model is able to reproduce patterns of mobility
similarity and D) predictability. The TF model results shown
in the inset in C show similarity values orders of magnitude
below the observed data. As the similarity is heavily
influenced by the frequency distribution of visits, this deviation is
likely due to the flatter distribution of $f_k$ produced by the TF
model.


DISCUSSION 

Linking mobility to social ties has generated a number
of insights into the dynamics of both. Social networks
are embedded in geography where face-to-face
interactions are often preferred and the chance of interacting with
those nearby is greatest. At the same time, we are
willing to travel to achieve this proximity and rendezvous at
places across the city for work and play. Novel
high-resolution data sets passively collected from mobile, online
devices now enable us to quantify the correlation between
mobility similarity and social behavior. Here we have
offered new metrics and empirical findings that relate
social behaviors to mobility similarity and
predictability. Our results show that our mobility is far more
similar to that of our social contacts than to that of strangers
and that this similarity can be used to reconstruct our own mobility
patterns. We find strong, positive correlations between
tie strength and mobility similarity. Moreover, temporal
variations in this similarity reveal three distinct groups
of social ties that hint at semantic types of relationships
such as co-worker or family member. These subgraphs
often have high levels of intra-group clustering,
suggesting functional groups of individuals within the network.
The mix of these groups amongst the edges of an
individual’s ego network is correlated with their mobility
behavior; users with many dissimilar contacts tend to
explore more locations. Speaking to their generalizability,
these results persist across three different cities in two
countries.

Finally, we extended an established mobility model to
include choices based on social behavior; the extended model
replicates the empirical findings described here as well as those
from other works. We call this model the GeoSim model and have
compared its results to two similar models. We hope that
this model provides a useful tool for future work in the
area. The findings presented have a number of
implications for those interested in social networks or mobility
applications extracted from ICTs. Additional contextual
information about relationships may help predict missing
links or provide critical details to more accurately model
the flows of information or diseases. Urban planners or
those needing good estimates of travel demand can
incorporate social mechanisms like the ones described here to
improve their models and to capture movements
previously unaccounted for. Robust findings that classify
social contacts from passive data alone may influence
future studies and help with data-informed policies through
city science. In the new data-rich reality of cities, deeper
insight into the connections between us will help make
the places we live more sustainable, efficient, productive,
and fun.